To run this notebook, you will need to perform the following installations.
!pip install numpy
!pip install nibabel
!pip install plotly
We will use the following functions in this notebook.
import os
from demo.plot import *
In this notebook, we display the results of two mass-univariate Linear Mixed Models. These analyses were performed on surface data drawn from the Adolescent Brain Cognitive Development (ABCD) study and conducted using the Big Linear Mixed Models (BLMM) toolbox.
The experiment conducted in this study was a working memory N-back task, and the response variable of interest was the percent BOLD (Blood Oxygenation Level Dependent) signal change: a measure of blood flow in the brain which acts as a proxy for neuronal activity. Prior analyses produced a 2-vs-0-back contrast image for each subject and session, reflecting the subject's average percent BOLD change in response to the 2-back task during a particular session. In each image, the average percent BOLD change is recorded for every vertex on a predefined cortical surface.
We are interested in understanding how a range of independent variables impacted the task-specific % BOLD response. The design matrix for both analyses included:
In total, the response data consists of 9835 fMRI surface images. These images were drawn from 5179 subjects, each of whom had data recorded for between 1 and 3 visits.
The two analysis designs differed in the random effects included in the model.
As random slopes cannot be considered for singleton subjects, design 2 was constrained to consider only subjects with 2 or more visits.
The Linear Mixed Model can be represented in the form:
$$Y = X\beta + Zb + \epsilon, \quad \epsilon \sim N(0,\sigma^2I), \quad b \sim N(0,\sigma^2D)$$where, assuming the model includes $n$ observations, $p$ fixed effects and $q$ random effects, the model matrices are:
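To make the notation concrete, here is a minimal simulation of the model above (with toy dimensions $n=6$, $p=2$, $q=3$; all values are illustrative and not taken from the ABCD analyses):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, q = 6, 2, 3      # observations, fixed effects, random effects
sigma2 = 1.0           # residual variance

X = rng.normal(size=(n, p))      # fixed effects design matrix
Z = rng.normal(size=(n, q))      # random effects design matrix
beta = np.array([1.0, -0.5])     # fixed effects parameters
D = np.eye(q)                    # random effects covariance (up to sigma^2)

b = rng.multivariate_normal(np.zeros(q), sigma2 * D)   # b ~ N(0, sigma^2 D)
eps = rng.normal(scale=np.sqrt(sigma2), size=n)        # eps ~ N(0, sigma^2 I)

Y = X @ beta + Z @ b + eps       # the response; one such model per vertex in practice
print(Y.shape)  # (6,)
```

In the real analyses this model is fit independently at every vertex, which is what makes the mass-univariate approach computationally demanding.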
The random terms are:
Our interest lies in estimating the parameters:
Typically $D$ has a repeating block structure containing only a few unique elements, so although $D$ may be large, we only need to estimate a few parameters.
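For example, with a random intercept and slope per subject (as in design 2), $D$ is block diagonal with one repeated $2 \times 2$ block, so only three unique parameters are estimated no matter how many subjects there are. A sketch with illustrative values:

```python
import numpy as np

# Per-subject covariance block: random intercept and slope (illustrative values)
D_subject = np.array([[1.0, 0.2],
                      [0.2, 0.5]])

n_subjects = 4  # toy value; the real analyses have thousands of subjects
D = np.kron(np.eye(n_subjects), D_subject)  # block diagonal, (8 x 8) here

# The same three numbers repeat down the diagonal blocks
print(D.shape)
print(D_subject[np.triu_indices(2)])  # the 3 unique parameters
```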
The input and output of BLMM is labelled according to the above notational conventions.
To run a BLMM analysis for this example, the above model must be specified as a blmm_inputs.yml file. For these analyses, this will look something like the following:
```yaml
Missingness:
  MinPercent: 0.5
X: /path/to/X.csv
Y_files: /path/to/y_files.txt
analysis_mask: /path/to/analysis/mask
clusterType: type_of_computational_cluster
Z:
  - f1:
      design: /path/to/factor_design_matrix.csv
      factor: /path/to/factor_vector.csv
contrasts:
  - c1:
      name: Intercept
      vector: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  - c2:
      name: Sex
      vector: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  - c3:
      name: Cross_sectional_age
      vector: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  - c4:
      name: Longitudinal_time
      vector: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
  - c5:
      name: NIH_score
      vector: [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
outdir: /path/to/output_directory
```
In the above inputs file, the Missingness parameters specify what the BLMM code should do in the presence of missing data (here, MinPercent: 0.5 tells BLMM to report results for any vertex with at least 50% of observations present). The X.csv file contains the previously listed independent variables, concatenated into a matrix of size $(n \times p)$, and Y_files.txt is a text file containing a list of surface images. Also specified are an analysis mask, which tells the BLMM code which vertices in the image we wish to perform an analysis upon, and the type of computational cluster the code is to be run on (e.g. Local, SLURM, SGE, etc.).
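For illustration, the MinPercent rule amounts to thresholding the per-vertex observation counts (toy numbers below; in the real analysis the counts are those reported in the blmm_vox_n output):

```python
import numpy as np

n_total = 9835                                   # total input images
n_per_vertex = np.array([9835, 4918, 4917, 0])   # hypothetical counts at 4 vertices

# MinPercent: 0.5 keeps vertices with at least 50% of observations present
analysis_mask = (n_per_vertex / n_total) >= 0.5
print(analysis_mask)  # [ True  True False False]
```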
The random effects in the model are specified by Z. As the model contains only one grouping factor (observations are grouped by subject only), there is only one entry in Z: the entry f1. To specify the subject-specific random effects, two files must be given under f1. The first is a CSV file containing the random effects design; that is, the matrix formed from the variables that should be stratified by subject (an intercept for design 1, and an intercept and longitudinal time for design 2). The second is an $(n \times 1)$ factor vector, which indicates which observation belonged to which subject. For instance, if the $5^{th}$ entry of the factor vector equals $2$, then the fifth observation in the analysis was drawn from the second subject.
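The relationship between the design file and the factor vector can be sketched with toy values (a random intercept per subject, as in design 1; none of these numbers come from the real data):

```python
import numpy as np

factor = np.array([1, 1, 2, 2, 2, 3])   # observation i belongs to subject factor[i]
design = np.ones((6, 1))                # random intercept only (design 1)

# Expand into the full random effects matrix Z: one column per subject,
# holding that subject's rows of the design file
subjects = np.unique(factor)
Z = np.zeros((len(factor), len(subjects) * design.shape[1]))
for j, s in enumerate(subjects):
    rows = factor == s
    Z[rows, j] = design[rows, 0]

print(Z)
```

Each row of Z has non-zero entries only in the columns belonging to that observation's subject, which is exactly the grouping structure the factor vector encodes.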
Finally, the inputs file specifies an output directory and five contrast vectors for null hypothesis testing. In this example, these correspond to an intercept, sex, cross-sectional age, longitudinal time and NIH score.
The results of the BLMM analyses run using the above inputs can be found in the demo folder. The results have the following folder name conventions:
- lh/rh: This indicates whether the analysis results are for the left or right hemisphere.
- des1/des2: This indicates whether the analysis results are for the first or second design.

For each analysis, we have the following output files:
| Filename | Description |
|---|---|
| blmm_vox_mask | This is the analysis mask. |
| blmm_vox_n | This is a map of the number of input images which contributed to each vertex in the final analysis. |
| blmm_vox_edf | This is the error degrees of freedom*. |
| blmm_vox_beta | These are the beta (fixed effects parameter) estimates. |
| blmm_vox_sigma2 | These are the sigma2 (fixed effects variance) estimates. |
| blmm_vox_D | These are the D (random effects variance) estimates**. |
| blmm_vox_llh | These are the log likelihood values. |
| blmm_vox_con | These are the estimated contrasts. In our example, these were computed for the intercept, sex, cross-sectional age, longitudinal time and NIH score. |
| blmm_vox_conSE | These are the standard errors of the contrasts multiplied by beta. |
| blmm_vox_conT | These are the T statistics for the contrasts. |
| blmm_vox_conTlp | These are the maps of -log10 of the uncorrected P values for the contrasts. |
| blmm_vox_conT_swedf | These are the maps of Satterthwaite degrees of freedom estimates for the contrasts. |
To view our results, we will need some files representing the geometry of the brain. We shall use the fsaverage5 pial surfaces taken from the FreeSurfer software package.
# Geometry file names
geom_lh = os.path.join(os.getcwd(),'demo','geom','lh.pial')
geom_rh = os.path.join(os.getcwd(),'demo','geom','rh.pial')
To help read in the results, we have provided a function that automatically generates the filename of the file you want to look at.
# Specify the analysis you want to look at (des1, des2)
analysis = 'des1'
# Specify the hemisphere you want to look at (left or right)
hemisphere = 'left'
# Specify the image type you want to look at (e.g. 'D' gives blmm_vox_D.dat)
image = 'beta'
# Get filename
data = get_fname(analysis, hemisphere, image)
We have provided the below function to view the results.
For images that contain multiple volumes, the volume_number argument can be used to look at different volumes. For example, the con image contains 5 contrast estimate images, which can be accessed by setting volume_number to 0, 1, 2, 3 or 4.
# Change volume number to view different images
plot_brain_surface(data, geom_lh, volume_number=1)
BLMM also allows for model comparison between the two designs. This can be performed in BLMM using the blmm_compare function. The files output in this case are given by:
| Filename | Description |
|---|---|
| blmm_vox_mask | This is the analysis mask (this will be the intersection of the masks from each analysis). |
| blmm_vox_Chi2.nii | This is the map of the likelihood ratio statistic. |
| blmm_vox_Chi2lp.nii | This is the map of -log10 of the uncorrected P values for the likelihood ratio test. |
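At a single vertex, the Chi2 and Chi2lp maps relate to the per-analysis log likelihoods roughly as sketched below. The log-likelihood values are hypothetical, and a plain chi-squared(2) null is assumed purely for illustration; the reference distribution used by blmm_compare may differ (tests on variance components often use mixture chi-squared nulls):

```python
import numpy as np

# Hypothetical per-vertex log-likelihoods (the blmm_vox_llh values)
llh_full = -1000.0      # larger model
llh_reduced = -1004.2   # reduced model

chi2_stat = 2 * (llh_full - llh_reduced)   # likelihood ratio statistic

# For a chi-squared null with 2 degrees of freedom, the survival
# function has the closed form exp(-x/2)
p = np.exp(-chi2_stat / 2)
neg_log10_p = -np.log10(p)
print(round(chi2_stat, 1), round(neg_log10_p, 3))  # 8.4 1.824
```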
These results can also be viewed by setting analysis='compare' in the above code.
Note: To perform comparison, the analyses must have equal sample sizes. For this reason, the above is actually a comparison between model 2 and a reduced version of model 1 which only contained the subjects with 2 or more visits.